ABSTRACT:

This paper describes a novel directly executed language (DELtran)
tailored specifically to the FORTRAN source language, EMMY host, and
scientific programming. DELtran is "transformationally complete" in
that:

1) Code generation is linear with respect to the number of
operators in a FORTRAN program.

2) Only k DELtran instruction units are needed to represent a
FORTRAN statement containing k functional operators.

3) The space needed to represent a FORTRAN statement approaches
N*v+F*k, where v is the number of distinct variables in the
statement, and N and F are the least integers such that there are
fewer than 2**N distinct variables and 2**F distinct operators in
the relevant scope of definition.

In addition, DELtran is "transparent" in that there is a 1-1
correspondence between DELtran operators and control constructs and
FORTRAN operators and control constructs, and "invertible" in that all
sensible sequences of DELtran instruction units have a direct FORTRAN
analogue.


The work described herein was supported in part by the Army Research
Office-Durham under Grant # DAAG-29-76-C-00D1.


1.  Introduction: 


DELtran  is  an  intermediate  language  tailored  to  a FORTRAN  source 
language,  EMMY  host  machine,  and  typical  community  of  scientific 
programmers.  Its  design  is  intended  to  minimize  execution  phase  time 
and  space,  subject  to  the  limitations  imposed  by  a one  pass 
compilation that performs only single statement optimization. Our
primary objective in synthesizing this language, however, is to
demonstrate the practicality of the design principles discussed in
Hoevel and Flynn [1], rather than to advance the state of the art in
FORTRAN execution.

With  this  in  mind,  we  limited  the  magnitude  of  our  task  by 
addressing  only  a subset  of  the  full  FORTRAN  language  (Basic  FORTRAN), 
and ignoring a number of questions relating to a production
environment  such  as  higher  level  data,  task,  and  job  management.  The 
resulting  design  does  not  preclude  extension  to  features  like  named 
COMMON,  additional  data  types  and  structures,  or  random  access 
external  files.  Multiple  named  COMMON  blocks,  complex  variables, 
logical  variables,  relational  operators,  and  simple  syntax 
enhancements like an IF...THEN construct could be implemented merely
by  changing  the  compiler  or  adding  a preprocessor.  Inclusion  of 
character  string  data  types  and  dynamic  storage  allocation  features 
would  require  altering  the  executor,  which  should  not  be  too  difficult 
since  it  is  a table  driven  interpreter.  The  instruction  unit 
structure  and  operand  referencing  mechanism  described  below  should  be 
compatible  with  the  modifications  needed  to  capture  the  full  FORTRAN 
language. 

General attributes of user communities, high level source
languages, and microprogrammable host machines relating to the DEL
synthesis problem are discussed elsewhere (Hoevel [2], Flynn [3],
Iliffe [4], and Welln [5]), and will not be repeated here. It is
instructive, however, to consider those particular features of our
experimental system that have had the greatest impact on the design of
DELtran.

The FORTRAN subset of interest here is usually referred to as
Basic FORTRAN (Heising [6], McClure [7]). The adjective "basic" is
not applied lightly; it is indeed a rudimentary programming language.
This turns to our advantage, however, by holding the size of the
design problem within reason. Some assumed source language features
and restrictions affecting the design of DELtran are:

1)  Its  name  space  is  entirely  static,  except  for  the  binding 
of  actual  arguments  to  formal  parameters. 

2) The natural range for a scope of definition is a procedure
specification (i.e., SUBROUTINE or FUNCTION block).

3)  Few  primitive  data  types  are  needed  (e.g.,  only  single  and 
double  precision  forms  of  fixed  and  floating  point  numbers). 

4) Unstructured program control is permitted (i.e., DO loops
need not be one-in one-out control structures).

5)  Parameters  are  uniformly  passed  "by  reference",  although 
this  is  equivalent  to  "by  copy  value"  when  expressions  are 
used  as  actual  arguments  (this  is  not  required  by  the 
standard,  but  follows  the  long  established  IBM  tradition). 

These  observations  are  extracted  from  the  preliminary  ANS 
specifications  for  FORTRAN  vs.  Basic  FORTRAN  [8].  Immediate 
implications are: recursive procedure invocation need not be
supported; both global and local storage can be statically allocated
during  compilation;  all  type  checking  can  be  performed  during 
compilation  (ignoring  parameters  to  procedures,  as  is  conventional); 
and  program  flow  analysis  can  involve  arbitrarily  complex  constructs. 




Host  Influence: 


The basic architecture of the EMMY host and its surrounding
laboratory environment are described in Neuhauser [9] and [10]. In
general, EMMY is a microprogrammable "universal host" with a 200 ns.
micro  store  and  an  800  ns.  main  store  (50  and  400  ns.  access  times, 
respectively).  Both  stores  are  32  bits  wide;  4K  words  of  read/write 
micro  store  and  16K  words  of  main  store  were  available  during  the 
development  of  DELtran.  Mass  storage  and  intelligent  console 
functions  are  provided  by  two  cassette  tape  drives  integral  to  a Data 
Point  2200  CRT  terminal.  Unique  host  characteristics  impacting  the 
design  of  DELtran  are: 

1) Register, control, and main stores are functionally
partitioned; i.e., different micro orders must be used to
access each of these stores.

2)  All  storage  resources  are  32  bits  wide,  and  may  be 
addressed  on  a 32  bit  word  basis  — main  store  alone  may  be 
treated  as  an  8,  16,  24,  or  32  bit  wide  memory,  and  addressed 
on  8,  16,  or  32  bit  boundaries. 

3)  EMMY's  flexible  field  extraction  operators,  which  include 
double  shift,  are  comparatively  slow  — consuming  as  much 
time  as  two  or  three  arithmetic  or  logical  operations. 

4) Decisions must be implemented by explicit test and branch
sequences since EMMY has no implicit tagging capability; more
time is needed to determine whether an address refers to main
or micro store than is needed to perform a main store access.

5) The basic addressing mode is zero offset register
indirection (i.e., the effective address is the contents of a
micro register); multi register and/or offset indexing is not
available.

A few  logistic  complications  also  affected  our  design.  During 
coding  and  testing  of  the  DELtran  executor,  only  nascent  program 
support  facilities  and  I/O  substructure  were  available.  As  a result, 
only  a minimal  interface  to  the  external  world  has  been  implemented  — 
all  input  and  output  is  done  through  the  basic  front  panel  display  and 
control  unit,  for  example. 

The  "block  access  unit"  anticipated  in  Neuhauser  [9],  which  was 
to  asynchronously  control  memory-to-memory  transfers,  has  been 
supplanted  by  the  "main  memory  control  unit"  described  in  Neuhauser 
[10].  The  earlier  design  permitted  a single  command  to  invoke  fully 
overlappable  transfer  of  an  entire  multi  word  block  either  within  or 
between  storage  resources;  the  later  design  permits  only  single  word 
transfers between different resources, at a per transfer cost of about
500 ns. in non-overlappable execution time. Because of this, the
invocation  mechanisms  described  below  differ  somewhat  from  the 
idealized  versions  discussed  in  Hoevel  and  Flynn  [1]. 


User  Influence: 

The  intended  user  community  is  assumed  to  be  composed  of  general 
purpose,  scientific  programmers.  User  characteristics  most  relevant 
to  the  design  of  DELtran  are: 

1) About half the statements in a typical source program deal
with program control, and about half are assignment
statements (Wichman [11], Rossman [12], and Lunde [13]).

2) The single most frequent type of statement is "A = B",
followed at some distance by "A = A + B" (Knuth [14]).

3)  DO  statements  almost  always  use  an  implicit  increment 
(stepping  value)  of  one  (Knuth  [14],  Rossman  [12]). 

4) Three distinct branches are usually specified for the
arithmetic IF statement (implied by the distribution of
branch statements noted in Flynn [15]).


While  these  assumptions  appear  applicable  to  a variety  of  user 
communities  and  source  languages,  specific  programs  could  deviate  from 
the  implied  statistical  distribution  of  operators,  names,  etc.  A more 
detailed  behavioral  model  could,  of  course,  be  extracted  from 
installation-specific trace-tape data.

2. General Description:


Due to the sequential nature of FORTRAN, both at the source and
machine code level, a linear outer form is used. The natural scope of
definition for source level identifiers is the program or subprogram
(i.e., MAIN, SUBROUTINE, or FUNCTION blocks). Indeed, the lack of
any other structured control units leaves little choice in this
matter, especially in light of our intent to minimize compilation
complexity.

Individual DELtran instruction units are broken down into
independently encoded subfields, of varying size, called syllables.
Three classes of syllables were required: operand syllables, which
denote DELtran variables (or labels); operator syllables, which denote
transformation rules to be applied to the DELtran data store; and
formlate syllables, which denote initializations to be performed in
preparation for a deferred operator syllable ("formlate" is coined
from the familiar terms format and template, and combines their
respective connotations of semantic and syntactic specification).

Word boundaries may be crossed immediately before or immediately
after either operator or formlate syllables; i.e., sequences of
operand syllables must lie within a single word (operand lists for
n-ary immediate operators such as CALL, READ, and WRITE excepted).
These syllables may be combined in three general syntactic sequences
to form DELtran instruction units:

    Leading Operator:  <OP> [ <A> [ <B> [...] ] ]

    Leading Formlate:  <F> [ <A> [...] ] <OP>

    Compound:          <F> [ <A> [...] ] <OP> [ <D> [...] ]

Leading operator forms generally deal with program control, involving
functions that do not fit well within the familiar molds of binary or
unary operators (the leading MOVE operator is an exception; it is
coded in this form because of its high frequency of occurrence). The
leading formlate construction factors out the operand decode and fetch
computations required by common operator functionalities: diadic (two
arguments, one result), monadic (one argument, one result), and onadic
(no arguments, no result). The compound form is used only with a few
high order functionality operators, or with array access primitives
that require information about explicit operand references not
provided by the standard formlate interface. The normal sequence of
interpretation for leading formlate constructions is:

Decode leading syllable: extract the 5 bit leading syllable from the
current instruction word (IW), and transfer control to the
appropriate interface routine.

Form interface: extract all W bit operand reference syllables;
fetch values of arguments; compute address of result, if
any.

Decode operator: extract operator code, and transfer to the
appropriate semantic routine.

Execute: compute designated transformation; store result, if
any; and begin another cycle of interpretation.

The  leading  operator  form  bypasses  the  explicit  operand  fetch  and 
interface  formation  steps,  proceeding  directly  to  the  execution  of  the 
designated  function.  In  this  case,  the  appropriate  semantic  routine 
assumes  responsibility  for  fetching,  decoding,  and  accessing  (any) 
operand  references.  This  is  similar  to  the  manner  in  which  deferred 
operators  that  require  additional  operands  fetch,  decode,  and  access 
referands  identified  by  deferred  operand  syllables. 

The mechanism for communicating information between interface and
semantic routines consists of three micro registers: P, Q, and R.
For binary formlates, P will contain the value of the left argument, Q
the value of the right argument, and R the address of the result.
Lower functionality requirements are derived from this standard
interface by deleting specifications. In the unary case, for example,
Q contains the value of the only (and hence still right most)
argument, and R the result address.
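
The division of labor between interface and semantic routines can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the dictionary `store`, the function names, and the
use of Python callables as semantic routines are hypothetical and are
not part of DELtran or EMMY.

```python
# Hypothetical sketch: the PQR convention modeled with plain Python values.
# `store` stands in for the DELtran data store; all names are illustrative.

def run_binary(store, left_ref, right_ref, result_ref, semantic_routine):
    """Binary formlate: P = left value, Q = right value, R = result address."""
    p = store[left_ref]        # form interface: fetch left argument
    q = store[right_ref]       # form interface: fetch right argument
    r = result_ref             # form interface: result address only
    store[r] = semantic_routine(p, q)   # execute, then store the result


def run_unary(store, arg_ref, result_ref, semantic_routine):
    """Unary formlate: P is dropped; Q = only argument, R = result address."""
    q = store[arg_ref]
    r = result_ref
    store[r] = semantic_routine(q)


store = {"A": 7, "B": 5, "C": 0}
run_binary(store, "A", "B", "C", lambda p, q: p - q)   # C := A - B
run_unary(store, "C", "A", lambda q: -q)               # A := -C
```

The semantic routines see only P, Q, and R, so the same subtraction
routine serves every formlate that binds two arguments and one result.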

This  "PQR"  interface  has  meaning  only  within  the  interpretation 
of  a leading  formlate  or  compound  type  of  instruction  unit.  Some 
residual  control  information,  called  the  DEL  program  state  vector, 
must  be  maintained  across  instruction  interpretations,  however.  The 
internal  DELtran  program  state  is  defined  by: 

1) Instruction Word (IW): a buffer for the DELtran
instruction stream.

2) Instruction Pointer (IP): a pointer to the next word of
instruction units in the DELtran program store.

3) Control Pointer (CP): a pointer to a linear definition
table for all accessible labels, variables, constants, etc.

4) Stack Pointer (SP): a pointer to the top of a dynamic
evaluation stack.

5) Syllable Width (W): a specification for the number of
bits in an operand reference syllable.

6) Evaluation Stack (ES): a LIFO queue containing the
results of intermediate computations.

Five of these six entities are encoded in three micro registers: the
current instruction word is kept in micro register I, and the current
instruction pointer is kept in micro register IP. The control pointer
CP, the current stack pointer SP, and the current syllable width W are
all encoded in a single micro register S:

    +----+--------+----+---+
    | SP | unused | CP : W |
    +----+--------+----+---+
    31   26       11   5   0

This  assignment  leaves  four  micro  registers  available  for  general  use. 
Three  of  these  (P,  Q,  and  R)  are  temporarily  dedicated  to  the  "PQR" 
interface  when  interpreting  leading  formlate  or  compound  instruction 
forms;  but  may  be  reassigned  when  the  standard  interface  is  not 
required.  The  remaining  micro  register,  X,  is  used  for  general 
purpose  indexing  and  scratch  storage. 

The  association  between  DELtran  operand  references  and  referands 
in  the  data  or  program  stores  is  defined  by  a single  linear  table 
called  the  current  contour.  Each  element  of  this  table,  called  a 
descriptor,  contains  two  pieces  of  information  — a shape  and  a 
locator.  Shape  specifiers  (high  8 bits)  define  an  entity's  size, 
justification,  and  the  granularity  of  its  locator,  but  not  its  logical 
data  type  in  the  classical  sense.  Locators  (low  24  bits)  are  directly 
the  address  of  a referand  in  EMMY's  main  store. 
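
A descriptor word of this form can be packed and unpacked with two
shifts and a mask. The following Python sketch assumes only the field
widths given above (8 bit shape, 24 bit locator, 32 bit word); the
function names are hypothetical, and the shape value used in the
example is an arbitrary placeholder rather than a real memory control
unit command code.

```python
SHAPE_BITS = 8
LOCATOR_BITS = 24
LOCATOR_MASK = (1 << LOCATOR_BITS) - 1
SHAPE_MASK = (1 << SHAPE_BITS) - 1


def pack_descriptor(shape, locator):
    """Build a 32 bit descriptor: shape in the high byte, locator below."""
    if not 0 <= shape <= SHAPE_MASK:
        raise ValueError("shape must fit in 8 bits")
    if not 0 <= locator <= LOCATOR_MASK:
        raise ValueError("locator must fit in 24 bits")
    return (shape << LOCATOR_BITS) | locator


def unpack_descriptor(word):
    """Split a 32 bit descriptor into (shape, locator)."""
    return (word >> LOCATOR_BITS) & SHAPE_MASK, word & LOCATOR_MASK


# 0x12 is an arbitrary placeholder shape code, not a documented value.
word = pack_descriptor(0x12, 0x000ABC)
```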

The current contour is physically divided into two parts: a data
table located at the bottom of micro store, and a label table located
at the top of micro store. Since the current contour is always
located in a fixed position, a dynamic environment pointer (i.e., the
ep in Johnston's Contour Model [19]) is not required; the control
pointer serves as an environment pointer for CALL and RETURN, but is
not normally used to interpret DELtran reference codes.

Because it is possible to distinguish between references to
variables and references to labels syntactically (for the given
FORTRAN source language), judicious placement of descriptors can
reduce the number of bits required in operand syllables. An operand
reference code N denotes the descriptor at location N if it refers to
a variable, and the descriptor at location -2**W+N if it refers to a
label, where W is the number of bits in an operand reference, and
micro store is treated as a circularly addressable memory. This means
that W may in fact be the least integer such that there are fewer than
2**W distinct labels and fewer than 2**W distinct variables, rather
than the least integer such that there are fewer than 2**W distinct
entities (both labels and variables) in a given scope of definition.
This addressing scheme is illustrated below:

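DELtran Reference Structure

[Figure: micro store, from address 0 upward, holds the descriptors for
variables, then the executor, and, from address -2**W upward, the
descriptors for labels. Variable descriptors point to the image of the
DELtran data store in main store; label descriptors point to the image
of the DELtran program store, which lies above unallocated storage and
extends to the upper end of main store (64K).]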

This figure also illustrates the general layout of DELtran programs in
EMMY's main store, with COMMON and LOCAL storage allocated just above
the 64 word evaluation stack, and program modules allocated at the
upper end of main store. If more than one procedure is included in a
module, COMMON is extended toward the higher addresses and LOCAL for
the (n+1)th procedure is allocated just above that for the n-th
procedure (MAIN is the 1st procedure). The actual bodies and skeletal
contours for procedures are allocated beginning at the high end of
main store and moving toward the lower addresses. This is identical
to the storage allocation strategy used by McClure [7], except for an
inversion of addresses and the fact that we limit our evaluation stack
to 64 elements.

The current contour is initialized by the CALL and RETURN
operators from skeletal contours pre-allocated during compilation.
There is one skeletal contour for each separate scope of definition;
i.e., for each SUBROUTINE or FUNCTION (including MAIN). Each skeletal
contour consists of a label definition table, linkage area, and a data
definition table:


[Figure: layout of a skeletal contour, located through the callee's CP.
In order: the descriptors for labels (label L through label 2**W-1);
the linkage area (Table of Contents, Caller's CP, Caller's IP,
Caller's IW); and the descriptors for variables (variable 1 through
variable V).]


The  table-of-contents  word  defines  the  number  of  formal  parameters, 
dynamic  (overlay)  variables,  static  variables,  and  label  descriptors 
for  the  associated  block.  The  "Caller's  ..."  words  in  the  linkage 
contain  the  DELtran  program  state  vector  elements  that  must  be 
restored  upon  encountering  a RETURN  instruction.  Skeletal  contours 
are  themselves  identified  by  the  "-1th"  word  of  a DELtran  module;  the 
"0th"  word  contains  the  returned  value,  if  it  is  a FUNCTION;  while  the 
"1st"  word  is  the  actual  beginning  of  the  executable  code  for  the 
module: 


Layout of a DELtran Module (F)

                          +----------------------------+
                          | Contour Pointer for F      |
    Descriptor for F -->  +----------------------------+
                          | Returned Value             |
    Initial IP for F -->  +----------------------------+
                          | 1st Instruction Word for F |
                          |            ...             |
                          | Last Instruction Word of F |
                          +----------------------------+


Letting the descriptor for a FUNCTION module identify the referand of
its returned value, as well as its entry point, helps to minimize the
number of distinct entities in a given scope of definition.


3.  Syllable  Descriptions 


All  DELtran  instruction  units  begin  with  a 5 bit  leading 
syllable.  The  32  distinct  codes  for  this  key  syllable  specify  either 
an  immediate  operation  or  a formlate  that  describes  the  preliminary 
processing  required  to  establish  the  standard  interface  for  a deferred 
operator. 


Five Bit Lead Syllable Encoding

    Code    Immediate Syntax            Immediate Semantics

    00000   FETCH                       fetch next instruction
    10000   MOVE <A> <B>                b := a
    01000   _TT <OP>                    t := OP(t)
    11000   _AB <A> <B> <OP>            b := OP(a)
    00100   _TA <A> <OP>                a := OP(t)
    01100   _AS <A> <OP>                s := OP(a)
    10100   _AA <A> <OP>                a := OP(a)
    11100   <OP>                        execute OP
    00010   UTU <OP>                    u := OP(u,t)
    00110   UTA <A> <OP>                a := OP(u,t)
    01010   ATT <A> <OP>                t := OP(a,t)
    01110   TAT <A> <OP>                t := OP(t,a)
    10010   ABS <A> <B> <OP>            s := OP(a,b)
    10110   ABC <A> <B> <C> <OP>        c := OP(a,b)
    11010   TAB <A> <B> <OP>            b := OP(t,a)
    11110   ATB <A> <B> <OP>            b := OP(a,t)
    00001   ABA <A> <B> <OP>            a := OP(a,b)
    00011   ABB <A> <B> <OP>            b := OP(a,b)
    00101   ATA <A> <OP>                a := OP(a,t)
    00111   TAA <A> <OP>                a := OP(t,a)
    01001   AAS <A> <OP>                s := OP(a,a)
    01011   AAB <A> <B> <OP>            b := OP(a,a)
    01101   AAA <A> <OP>                a := OP(a,a)
    01111   CALL n <F> <A1> ... <An>    invoke F(A1, ..., An)
    10001   RETURN                      return from invocation
    10011   GO <L>                      goto l
    10101   CGO <I> <L>                 goto (l+i-1)
    10111   IFE <E> <L>                 goto (l+(e=0)+2*(e>0))
    11001   IFT <L>                     goto (l+(t=0)+2*(t>0))
    11011   ENDO <N> <I> <M> <L>        goto l if n:=n+i <= m
    11101   END1 <N> <M> <L>            goto l if n:=n+1 <= m
    11111   BREAK                       trap to monitor function


This listing, which is in "trailing zeros" order, uses the same
general notation as in the formlate discussion in Hoevel and Flynn
[1]. Generic syllables are enclosed in angle brackets "<>", while
specific codes are not; a three character mnemonic system is used to
identify formlate structure.

The  first  letter  designates  the  operand  to  be  associated  with  the 
left  argument  of  a binary  operator;  the  second  letter  designates  the 
operand  to  be  associated  with  the  right  argument  of  a binary  (or  only 
argument  of  a unary)  operator;  and  the  third  letter  designates  the 
operand  to  be  associated  with  the  result. 

The  following  character  code  is  used  to  signify  particular 
operand  bindings.  A,  B,  and  C denote  the  explicit  reference  codes 
appearing  in  the  first,  second,  and  third  explicit  operand  syllables 
following  a formlate;  the  variables  identified  by  these  codes  are 
designated  by  the  corresponding  lower  case  letters.  S,  T,  and  U 
denote  the  top  elements  of  an  implicit  evaluation  stack;  T corresponds 
to the current top of this stack, U to the position just below T; and
S to  the  (unused)  position  just  above  T.  Again,  lower  case  letters 
denote  the  values  of  these  referands.  An  underscore  denotes  an 
argument  or  result  position  that  is  not  used. 
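
The mnemonic convention is regular enough to decode mechanically. The
Python sketch below is an illustration only: the function name and the
descriptive strings are hypothetical, and the rule that the number of
explicit operand syllables equals the number of distinct letters among
A, B, and C is inferred from the mnemonic system rather than stated in
the text.

```python
EXPLICIT = "ABC"
STACK = {
    "S": "position just above the stack top",
    "T": "current stack top",
    "U": "position just below the stack top",
}
ROLES = ("left argument", "right argument", "result")


def describe_formlate(mnemonic):
    """Return (bindings, explicit_syllable_count) for a 3 letter mnemonic."""
    if len(mnemonic) != 3:
        raise ValueError("formlate mnemonics have exactly three characters")
    bindings = {}
    for role, letter in zip(ROLES, mnemonic):
        if letter == "_":
            bindings[role] = None                      # position not used
        elif letter in EXPLICIT:
            bindings[role] = f"explicit operand <{letter}>"
        elif letter in STACK:
            bindings[role] = STACK[letter]
        else:
            raise ValueError(f"unknown binding letter {letter!r}")
    # A repeated letter reuses one syllable, so count distinct letters only.
    explicit_count = len({c for c in mnemonic if c in EXPLICIT})
    return bindings, explicit_count
```

For example, `describe_formlate("ABA")` reports two explicit syllables
and binds both the left argument and the result to `<A>`, which is the
encoding suited to a statement such as "A = A + B".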

In  practice,  lead  syllable  codes  are  extracted  from  the  residual 
instruction  word  register  (I)  using  the  double  shift  technique,  and 
then  added  to  the  microprogram  counter  ($)  to  effect  an  indexed 
branch. In EMMYXL notation (Hedges [16]):

    X := 0                   .clear index register X

    ...                      .possible intervening code

    X,I << 5 ; $ := $+X      .extract lead syllable
    (table of entry points)

Instruction  units  in  the  table  following  the  extraction  may  perform 
useful  computations  as  well  as  transfer  microprogram  control  to  the 
remaining  body  of  the  appropriate  routine  (due  to  the  semihorizontal 
nature  of  EMMY's  native  language;  see  Neuhauser  [10]). 
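
The double shift and indexed branch can be modeled in Python as
follows. This is a sketch under stated assumptions, not EMMY micro
code: it assumes that syllables are consumed from the high order end
of the 32 bit instruction word, and the list `entry_points` with its
string entries is a hypothetical stand-in for the table of entry
points.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1
LEAD_BITS = 5


def double_shift_left(x, i, count):
    """Shift the register pair X:I left by `count` bits; return (X, I)."""
    pair = ((x & WORD_MASK) << WORD_BITS) | (i & WORD_MASK)
    pair = (pair << count) & ((1 << (2 * WORD_BITS)) - 1)
    return pair >> WORD_BITS, pair & WORD_MASK


def dispatch(i_word, entry_points):
    """Extract the lead syllable into a cleared X and index the table."""
    x, residual = double_shift_left(0, i_word, LEAD_BITS)
    return entry_points[x], residual


# Hypothetical table: one named routine per 5 bit lead syllable code.
entry_points = [f"routine_{code:05b}" for code in range(1 << LEAD_BITS)]
routine, residual = dispatch(0xC0000001, entry_points)
```

The residual word returned by `dispatch` is what remains in I for the
operand and operator syllables that follow the lead syllable.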

The DELtran formlate set permits full exploitation of repeated
operands (either as arguments alone, or in combination with the result
specification), and is "transformationally complete" in the sense that
any binding of explicit operands (i.e., primitive variables) and
implicit operands (i.e., stack elements) can be generated by local
combinatorial analysis of the FORTRAN source code (Flynn [3], and
Hoevel and Flynn [1]). Note also that deferred operators are
partitioned into disjoint classes according to their functionality by
the inclusion of distinct binary, unary, and nullary formlates; and that
reverse forms of deferred operators are not needed, since all required
argument permutations are contained in the formlate set.

The MOVE operator simply transfers the value of the referand
identified by operand reference <A> to the referand identified by
operand reference <B>. Simple program control operators such as GO,
CGO, IFT, and IFE cause the current instruction word register (I) to
be reloaded from the bit address in DELtran's program store identified
by the appropriate label descriptor. The single label reference
appearing in the CGO, IFT, and IFE constructs is actually the first
entry in a subtable of the current contour; the data dependent index
into this table is determined by the semantic routine for each of
these operators.
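
The data dependent index into the label subtable follows directly from
the semantics listed for CGO, IFE, and IFT in the lead syllable table.
A minimal Python sketch, in which the list `labels` and the function
names are hypothetical stand-ins for the label portion of the current
contour and the semantic routines:

```python
def three_way_offset(value):
    """0 for a negative value, 1 for zero, 2 for a positive value."""
    return int(value == 0) + 2 * int(value > 0)


def arithmetic_if(labels, l, value):
    """IFE/IFT: select one of three consecutive label descriptors."""
    return labels[l + three_way_offset(value)]


def computed_goto(labels, l, i):
    """CGO: select the i-th (1-based) label of the subtable starting at l."""
    return labels[l + i - 1]


# Hypothetical label table; entry 1 starts a three-way IF subtable.
labels = ["L10", "L20", "L30", "L40"]
```

Because the three branch targets occupy consecutive descriptors, an
arithmetic IF needs only one label syllable in the instruction unit.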

ENDO and END1 operators cause the value identified by <N> to be
incremented and then execute a GO <L> if the result is less than or
equal to <M>. The increment value is assumed to be one for the END1
operator, but is explicitly denoted by <I> for the more general ENDO
operator. Breaking out the special case of END1 is indicated not only
by the default specification rule for FORTRAN looping constructs, but
also by empirical user statistics (Knuth [14]).
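
The loop closing behavior described here can be sketched in Python.
The names `end_do` and `run_do_loop` and the dictionary `store` are
hypothetical; the sketch covers only the increment, compare, and branch
decision, not the instruction encoding.

```python
def end_do(store, n, m, increment=1):
    """Increment store[n]; return True when control should GO to <L>.

    With the default increment this models END1; passing an explicit
    increment models the more general ENDO.
    """
    store[n] += increment
    return store[n] <= store[m]


def run_do_loop(first, limit, step=1):
    """Collect the index values seen by the body of a DO loop."""
    store = {"N": first, "M": limit}
    visited = []
    while True:
        visited.append(store["N"])            # loop body
        if not end_do(store, "N", "M", step):
            break                             # fall through past the loop
    return visited
```

Since the test sits at the end of the loop, the body runs at least
once even when the initial value already exceeds the limit.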

CALL and RETURN operators are somewhat more complicated, since
they involve modification of the internal state of the DELtran
executor. CALL causes the volatile portion of the current contour to
be paged out to its static image, which is identified by the control
pointer CP. The instruction pointer, instruction word, and status
registers (IP, I, and S) are also saved in a linkage area within this
skeletal contour, thus saving the DELtran program status vector and
hence all information needed to resume the caller's process. The
skeletal contour for the callee is then moved into the current
contour, and descriptors for formal parameters copied into the
appropriate locations. The IP is set to point to the first word of
the callee's program body, the first instruction word is fetched, and
the state register S is loaded with the callee's syllable width and
control pointer.

RETURN simply undoes a CALL. Only those descriptor elements in
the original caller's skeleton contour which were overlaid during the
CALL operation need be restored, however. These are easy to determine
by comparing the upper and lower reference index bounds for both
programs, which are stored in a linkage area in their skeleton
contours. We save and restore the contents of the caller's old
instruction word register to avoid wasting static space in the DELtran
program store, even though the time required to perform this linkage
is greater than that which would be required simply to fetch a new
instruction word from the program store.

Operand  Syllables: 

As  noted  above,  the  width  of  operand  syllables  may  vary  from  one 
scope  of  definition  to  another.  The  current  number  of  bits  in  an 
operand  syllable,  W,  is  maintained  in  the  low  order  six  bits  of  the 
DELtran  secondary  state  register,  S,  which  is  automatically  saved  and 
restored  by  the  execution  semantics  for  CALL  and  RETURN.  For  short 
subroutines  or  functions,  only  three  or  four  bits  are  needed  to 
identify  a unique  variable;  in  larger  modules,  however,  six  to  eight 
bits  may  be  needed. 
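
The saving obtained by counting labels and variables separately can be
checked with a short calculation. In this Python sketch the function
names are hypothetical; the rule applied is the one stated earlier,
namely that W is the least integer such that each count is less than
2**W.

```python
def bits_needed(count):
    """Least W such that count < 2**W."""
    return count.bit_length()


def split_width(n_variables, n_labels):
    """W when variables and labels are numbered independently."""
    return max(bits_needed(n_variables), bits_needed(n_labels))


def combined_width(n_variables, n_labels):
    """W when every entity shares a single numbering."""
    return bits_needed(n_variables + n_labels)
```

For a hypothetical scope with 12 variables and 10 labels,
`split_width(12, 10)` gives 4 while `combined_width(12, 10)` gives 5,
so every operand syllable in that scope is one bit shorter.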

The map from reference codes to descriptors for DELtran variables
is simple and direct: the descriptor for the variable with reference
code N is located at address N in micro store. It is possible to extract
an operand reference and look up the corresponding descriptor in a
single EMMY instruction, which would appear in the EMMYXL notation as:

    X,I << S ; R := M(X)

where X is a previously cleared index register, I is the current
instruction word register, and R is a micro register that is to
contain the descriptor value. The low order bits of micro register S,
which contain the current value of W, indirectly govern the extent of
the double shift from I into X indicated in the first half of the
instruction. The second half of the instruction causes the micro
store word at the location indicated by the low order twelve bits of
the index register to be loaded into R.

The map from reference codes into label descriptors is somewhat
more complicated to explain, but equally easy to calculate: the
descriptor for the label whose reference code is L is located at
-2**W+L, viewing micro store as a circularly addressed memory (the
absolute address is 4096-2**W+L, but since EMMY's hardware ignores the
upper 20 bits of a micro store address, our circular model is valid).
The same micro instruction used to associate variable reference codes
with variable descriptors can be used for labels, although the index
register X must be initialized to minus one.
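
The effect of the two initial values of X can be reproduced with
ordinary integer arithmetic. The Python sketch below assumes a 32 bit
index register, reference codes taken from the high order end of the
instruction word, and a micro store address formed from the low order
twelve bits of X; the function name is hypothetical.

```python
WORD_MASK = 0xFFFFFFFF
MICRO_ADDRESS_MASK = 0xFFF      # low order twelve bits of X


def descriptor_address(i_word, w, is_label):
    """Return (micro store address, residual instruction word)."""
    x = WORD_MASK if is_label else 0          # X = -1 for labels, else 0
    # Double shift: the top w bits of I enter the low end of X.
    x = ((x << w) | ((i_word & WORD_MASK) >> (32 - w))) & WORD_MASK
    residual = (i_word << w) & WORD_MASK
    return x & MICRO_ADDRESS_MASK, residual


# Reference code 3 in the top four bits of the instruction word.
variable_address, _ = descriptor_address(0x30000000, 4, is_label=False)
label_address, _ = descriptor_address(0x30000000, 4, is_label=True)
```

With X preset to all ones, the shifted-in code replaces only the low W
bits, which yields (-2**W + L) modulo the size of the micro store.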

Descriptors for variables consist of an 8 bit shape specification
in the high order byte, which is actually a command code to the memory
control unit that specifies the width of the entity in question (8,
16, 24, or 32 bits), together with a 24 bit locator in the low order
bytes. The shape designator also specifies the granularity for the
locator (8, 16, or 32 bit quanta); the locator directly identifies the
main store image of a referand in the DELtran data store.
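As an illustration of this layout, the following sketch packs and unpacks a 32 bit descriptor word. The helper names and the sample shape code 0x20 are invented for the example; the actual command codes accepted by the memory control unit are not given here.

```python
SHAPE_BITS = 8
LOCATOR_BITS = 24
LOCATOR_MASK = (1 << LOCATOR_BITS) - 1

def pack_descriptor(shape, locator):
    """Combine an 8-bit shape code and a 24-bit locator into one word."""
    if not 0 <= shape < (1 << SHAPE_BITS):
        raise ValueError("shape must fit in 8 bits")
    if not 0 <= locator <= LOCATOR_MASK:
        raise ValueError("locator must fit in 24 bits")
    return (shape << LOCATOR_BITS) | locator

def unpack_descriptor(word):
    """Split a descriptor word into (shape, locator)."""
    return (word >> LOCATOR_BITS) & 0xFF, word & LOCATOR_MASK

word = pack_descriptor(0x20, 0x001234)   # 0x20 is a made-up shape code
```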

Descriptors for labels are similarly structured, but in this case
the locator is actually a bit address in the DELtran program store.
The target instruction word must be shifted after loading to obtain
proper alignment. The granularity specifiers within the shape code
are used to minimize the magnitude of this shift.


Deferred  Operator  Syllables: 

Deferred operators are categorized as diadic (two arguments, one
result), monadic (one argument, one result), or onadic (no arguments,
no results). Data types are not checked dynamically because FORTRAN
is such a strongly typed language in its own right, and hence distinct
operator codes are used to denote integer and floating functions.
Some collapsing of the DEL operator set was possible where only the
sign of an operand or equivalence to zero need be checked, as with the
IF statement, since these representations are the same for both fixed
and floating point (internal value representation consistent with the
370 architecture has been used for pragmatic reasons; see Wallach
[17]).

Deferred  operator  syllables  are  decoded  in  the  same  manner  as 
leading  syllables,  except  that  different  branch  tables  are  used  (one 
for  4 bit  binary  operator  codes,  one  for  4 bit  unary  operator  codes, 
and  one  for  3 bit  nullary  operator  codes). 
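A branch table decode of this kind can be sketched with one lookup table per operator class. The tables below hold only a handful of the operators, with codes written as bit strings exactly as they appear in the encoding tables; the Python functions standing in for the micro routines, and the `execute` helper, are assumptions for illustration.

```python
import math

# One table per operator class; keys are operator codes as bit strings.
DIADIC = {
    "0100": lambda p, q: float(p) + float(q),   # F+
    "1100": lambda p, q: int(p) + int(q),       # I+
    "1010": lambda p, q: float(p) * float(q),   # F*
    "1110": lambda p, q: int(p) * int(q),       # I*
}
MONADIC = {
    "0100": float,      # FLOAT
    "1100": int,        # FIX (truncates toward zero)
    "1010": math.log,   # LOG
    "1110": math.sin,   # SIN
}
TABLES = {"diadic": DIADIC, "monadic": MONADIC}

def execute(kind, code, *args):
    """Select the branch table for this operator class, then dispatch."""
    return TABLES[kind][code](*args)

total = execute("diadic", "0100", 1.5, 2.0)
truncated = execute("monadic", "1100", 2.75)
```

The same bit pattern (for example 0100) selects a different routine in each table, which is why the operator class must be known before the syllable is decoded.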

Four Bit Encoding of Diadic Operators

Code    Deferred Syntax    Deferred Semantics

0000    FETCH              fetch the next instruction word
1000    A2E <D>            associate reference code D with r(p,q)
0100    F+                 r := p+q (floating add)
1100    I+                 r := p+q (integer add)
0010    F-                 r := p-q (floating subtract)
0110    I-                 r := p-q (integer subtract)
1010    F*                 r := p*q (floating multiply)
1110    I*                 r := p*q (integer multiply)
0001    -A2-               prefix to array accessing operators
 " 00   MA2 <D>            r(p,q) := d
 " 01   A2M <D>            d := r(p,q)
 " 10   TA2                r(p,q) := t
 " 11   A2S                s := r(p,q)
        F/                 r := p/q (floating divide)
        I/                 r := p/q (integer divide)
        F^F                r := p**q (floating to floating power)
        I^I                r := p**q (integer to integer power)
        FST                r := sgn(p)*q (floating sign transfer)
        IST                r := sgn(p)*q (integer sign transfer)
        BREAK              trap to monitor


The -A2- operators are perhaps not self defining; in general, the
two argument values in the P and Q interface registers are treated
as the first and second subscripts for the array whose descriptor will
be in the result register, R. A2E causes the effective address of the
indicated array element to be computed, creates a descriptor to this
element by combining the shape field from the array descriptor with
this address, and stores the result in the contour slot for the
deferred reference code D. The MA2 and A2M operators work in a similar
fashion, but actually cause a state transformation in the DELtran data
space; in this respect they are similar to the MOVE operator. TA2 and
A2S are "push" and "pop" operators that transfer values between the
evaluation stack and array elements.

Initially,  we  intended  to  perform  array  accessing  implicitly  by 
dynamically  checking  the  structural  type  of  each  variable  descriptor 
before  using  it  to  load  or  store  a value.  However,  without  specific 
hardware  support  this  proved  to  be  too  inefficient.  Substantial  code 
compaction,  as  well  as  execution  time  reduction  for  array  accesses, 
may  be  possible  in  systems  based  on  a tagged  architecture  host,  such 
as described in Feustel [18] and Iliffe [4].

Bounds checking is not performed, in keeping with the tradition
established by IBM. It would be easy to incorporate by modifying the
appropriate array accessing routines, and would not involve a high
space or time penalty for the EMMY host. The multiplier needed to
compute the effective address of an indexed array element is stored at
the "base" of the array (i.e., is its zero-th element; this works for
FORTRAN since array subscripts must begin with one).
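To make the role of the stored multiplier concrete, here is a sketch of a two-subscript address calculation. It assumes FORTRAN's usual column-major layout, so that the multiplier is the first dimension, and it counts addresses in whole elements rather than bytes; neither detail is spelled out in the text, and the helper names are hypothetical.

```python
def make_array(rows, cols, fill=0):
    """Element 0 holds the multiplier; elements 1..rows*cols hold the data."""
    return [rows] + [fill] * (rows * cols)

def element_index(store, base, p, q=1):
    """Effective address of element (p, q) of the array based at `base`."""
    multiplier = store[base]            # fetched from the zero-th element
    return base + p + (q - 1) * multiplier

# Hypothetical data store: five unrelated words, then a 3 x 2 array.
store = [0] * 5 + make_array(3, 2)
base = 5
first = element_index(store, base, 1, 1)
last = element_index(store, base, 3, 2)
```

Because subscripts start at one, the slot at `base` is never addressed as an element, which is what frees it to hold the multiplier. A bounds check would amount to one extra comparison inside `element_index`.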

Four Bit Encoding of Monadic Operators

Code    Deferred Syntax    Deferred Semantics

0000    FETCH              fetch new instruction word
1000    A1E <D>            associate reference code D with r(p)
0100    FLOAT              r := float(p)
1100    FIX                r := fix(p)
0010    F'                 r := -p (floating negate)
0110    I'                 r := -p (integer negate)
1010    LOG                r := log(p) (logarithm)
1110    SIN                r := sin(p) (sine)
0001    -A1-               prefix for array accessing operators
 " 00   MA1 <D>            r(p) := d
 " 01   A1M <D>            d := r(p)
 " 10   TA1                r(p) := t
 " 11   A1S                s := r(p)
0011    COS                r := cos(p) (cosine)
0101    TANH               r := tanh(p) (hyperbolic tangent)
0111    PAUSE              pause with code p
1001    STOP               stop with code p
1011    TIME               r := (current time)-p
1101    not used
1111    BREAK              trap to monitor

The "A1" array oriented operators are just like the "A2"
operators described above, except that only a single subscript is
required (the argument value in the P register of the standard
interface). Again in the IBM tradition, no bounds checking is
performed. The TIME function, although not required by the ANS
specification [8], is included in order to facilitate experimental
evaluation of the final system.

Some compression of the operator set suggested by the semantics
of Basic FORTRAN has been obtained by noting a few non-trivial
algebraic relations. In particular, the EXP function can be replaced
by the binary F^F operator; i.e., instead of generating:

_XY [ <A> [ <B> ] ] EXP

we generate:

AX'Y' <e> [ <A> [ <B> ] ] F^F

(where X'Y' is derived from XY by transforming A->B and B->C). This
is the same as the observation that EXP(x) can be rewritten as E**(x),
where E is a constant with value 2.718..., for any expression x.
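The rewrite can be sketched as follows. The sketch treats XY as a two-letter operand format and renames its letters as described (A->B, B->C, other letters unchanged), then prepends the new first operand e; the function name and the tuple representation of an instruction unit are assumptions made for the example.

```python
import math

# A becomes B and B becomes C; stack letters such as S and T are untouched.
RENAME = str.maketrans({"A": "B", "B": "C"})

def exp_to_power(fmt_xy, operands):
    """Rewrite '<XY> operands EXP' as '<AX'Y'> e operands F^F'."""
    return "A" + fmt_xy.translate(RENAME), ["e"] + list(operands), "F^F"

unit = exp_to_power("AB", ["x", "y"])

# Numerically, EXP(x) and E**x agree to rounding error.
x = 1.7
same = math.isclose(math.exp(x), math.e ** x)
```

The renaming is needed because the constant e becomes the first explicit operand, pushing every original operand one position to the right.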

Three Bit Encoding of Onadic Operators

Code    Deferred Syntax          Deferred Semantics

000     FETCH                    fetch next instruction word
100     SET <U> <F>              set Unit := U and Format := F
010     READ n <D1>...<Dn>       input to D1...Dn as per Unit/Format
110     WRITE n <D1>...<Dn>      output from D1...Dn as per Unit/Format
001     REWIND                   rewind Unit
011     BACKSPACE                backspace Unit
101     ENDFILE                  write end-of-file mark on Unit
111     BREAK                    trap to monitor


Compound instruction units of the form "<onadic OP> ..." are
really nothing more than a partial frequency encoding of infrequent
and/or difficult to handle functions, the bulk of which deal with
input/output. Two residual control cells, Unit and Format, are used
to maintain the status of I/O operations. Unit corresponds to a
logical designation of a specific file/device/channel combination, and
would in practice be bound by a surrounding operating system as
specified by some external job control language. The Format cell is
merely a byte pointer into a string of field specifications produced
during compilation from the appropriate FORTRAN format statement.


Encoding of Format Control (I/O) Operators

FORTRAN                   DELtran Construct
Construct                 Symbolic        Actual

n(                        ( n             0 n
Fw.d                      F w d           1 w d
Ew.d                      E w d           2 w d
In                        I n             3 n
nX                        X n             4 n
//.../ (n slashes)        / n             5 n
nHab... (n chars.)        H n a b ...     6 n a b ...
n... (repeat count)       REP n           7 n
)                         )               8


Although the full I/O structure indicated above has not yet been
implemented, the intent is that it should proceed as a
subinterpretation, either with EMMY performing conversions under
control of the current Format, or with the control device for Unit
performing these conversions asynchronously. The Unit and Format
residual control cells are, respectively, the environment pointer and
instruction pointer for this subinterpretation. An entire byte is
used to encode formatted field specifications simply to keep this
process as simple as possible; the spatial penalty is low since I/O
statements are statically insignificant.
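Since the I/O structure was not implemented, the following is only a guess at how the Format cell might step through a byte-encoded field string. It decodes rather than executes the string. The numeric codes (0 for a group opening through 8 for a closing parenthesis), the operand counts, and the group count of 1 in the sample are all assumptions read from the encoding table as best its layout allows.

```python
# Assumed byte codes and the number of one-byte operands that follow each.
NAMES = {0: "(", 1: "F", 2: "E", 3: "I", 4: "X", 5: "/", 6: "H", 7: "REP", 8: ")"}
OPERANDS = {0: 1, 1: 2, 2: 2, 3: 1, 4: 1, 5: 1, 7: 1, 8: 0}

def decode_format(fmt):
    """Walk a byte string of field specifications, one byte per item."""
    fields = []
    ptr = 0                                   # plays the role of the Format cell
    while ptr < len(fmt):
        code = fmt[ptr]
        ptr += 1
        if code == 6:                         # H: a count, then that many characters
            n = fmt[ptr]
            ptr += 1
            text = bytes(fmt[ptr:ptr + n]).decode("ascii")
            ptr += n
            fields.append(("H", n, text))
        else:
            count = OPERANDS[code]
            fields.append((NAMES[code],) + tuple(fmt[ptr:ptr + count]))
            ptr += count
    return fields

# FORMAT (1H ,2I5), with an assumed group count of 1 on the opening parenthesis.
sample = bytes([0, 1, 6, 1, 32, 7, 2, 3, 5, 8])
fields = decode_format(sample)
```

Keeping every item to a whole byte means the pointer advances by simple increments, which is the simplicity argument made in the text.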


4.  Examples:

A few examples may help clarify the preceding discussion:


      FORTRAN Statement               DELtran Equivalent

1)    A = B                           MOVE <B> <A>

2)    I = J-I                         ABB <J> <I> I-

3)    I = J*J + I                     AAS <J> I*
                                      TAA <I> I+

4)    GOTO 10                         GO <#10>

5)    DO 10 I = 1, 100                MOVE <1> <I>
      A = F(A,I)                 #10  CALL 2 <F> <A> <I>
                                      MOVE <F> <A>
   10 CONTINUE                        END1 <I> <100> <#10>

6)    IF (A-B) 1,2,3                  ABS <A> <B> F-
                                      IFT <#1>

7)    WRITE (6,10) N,M                SET 6 10
                                      WRITE 2 <N> <M>

8) 10 FORMAT (1H ,2I5)           #10  <(> <H> 1 <R> 2 <I> 5 <)>

9)    I = A(I,I)                      AAB <I> <A> A2S
                                      TA <I> FIX
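Examples 2 and 3 can be traced with a toy interpreter. It assumes a reading of the three-letter leading syllable that is consistent with the examples but not stated in this section: the letters name the P source, Q source, and R destination; A and B select the first and second explicit operands; T pops the evaluation stack; and S pushes onto it. The function names and the tuple form of an instruction unit are hypothetical.

```python
import operator

OPS = {"I+": operator.add, "I-": operator.sub, "I*": operator.mul}

def read(letter, names, env, stack):
    """Fetch an argument: T pops the stack, A/B/... name explicit operands."""
    if letter == "T":
        return stack.pop()
    return env[names[ord(letter) - ord("A")]]

def run(units, env):
    """Execute (format, operand names, operator) triples against env."""
    stack = []
    for fmt, names, op in units:
        p = read(fmt[0], names, env, stack)
        q = read(fmt[1], names, env, stack)
        result = OPS[op](p, q)
        if fmt[2] == "S":
            stack.append(result)              # defer the value on the stack
        else:
            env[names[ord(fmt[2]) - ord("A")]] = result
    return env

# Example 2: I = J-I
env2 = run([("ABB", ["J", "I"], "I-")], {"I": 3, "J": 10})

# Example 3: I = J*J + I
env3 = run([("AAS", ["J"], "I*"), ("TAA", ["I"], "I+")], {"I": 3, "J": 10})
```

In example 3 the variable J is named once but used as both arguments, and the intermediate product never receives a name of its own.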